Note - 5p.

[illegible], Research Problems, [illegible]

The reanalysis of data collected in large research projects presents problems,
mainly because the hypotheses to be tested may be dependent on the hypotheses of
the original project. One possible solution to this problem would be to use the data to
group subjects into typologies. (HW)
REANALYSIS OF DATA FILES: DEPENDENT HYPOTHESES

AND A RECOMMENDATION

Harvey F. Dingman, Ph.D.
and 

Robert F. Peck, Ph.D. 

University of Texas 

Abstract 

The reanalysis of data collected in large research projects presents
problems. It appears that the hypotheses to be tested probably will not
be independent. If the data are used to group the subjects into
typologies, subsequent research would be improved.

Large scale research projects often collect valuable pools of data 
that are not completely analyzed. Much of the data gathered on large
projects consist of incidental or background measures that are not re- 
lated to the major hypotheses being tested. This large pool of data 
frequently describes a population well enough to be the basis for future 
studies. At the very least, these data pools provide the normative basis 
for future experimentation. 

While the existence of large data banks has been publicized by
organizations (Glaser, 1967), the use of the data poses formidable problems.

Strictly speaking, if the data file were considered to be a random collection
of data representing a population, then the pool could be sampled an
infinite number of times with an infinite number of hypotheses. Whether the
probability level to be attached to each hypothesis should be modified by the
number of hypotheses tested with those data is not clear. This is
particularly ambiguous if the data are sampled by independent investigators with
independent hypotheses.
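
The size of the problem can be illustrated with a short calculation. The sketch below is not taken from the projects described here; it assumes that every test is run at the same nominal level and that the tests are statistically independent, which is the very condition that reanalysis of a single pool tends to violate. The function names are hypothetical, and the Bonferroni division is shown only as one familiar way of modifying the level, not as a recommended answer to the question raised above.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one false rejection among m independent tests,
    each run at level alpha (independence is an assumption here)."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_level(alpha: float, m: int) -> float:
    """Per-test level that keeps the overall error rate at or below alpha."""
    return alpha / m


if __name__ == "__main__":
    # Show how quickly the overall error rate grows with the number of tests.
    for m in (1, 5, 20, 100):
        overall = familywise_error_rate(0.05, m)
        per_test = bonferroni_level(0.05, m)
        print(f"{m:>3} tests: overall rate {overall:.3f}, adjusted level {per_test:.5f}")
```

With dependent hypotheses the overall rate no longer follows this formula, which is one reason the proper modification is hard to state for a shared data pool.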

A more practical problem arises when a research project is analyzing a 
set of data for several purposes. The Austin, Texas, public school system 
has been studied for three years by the Research and Development Center
at the University of Texas at Austin (Peck, 1968). The teachers who
were taught in the experimental teacher education program at the
University of Texas have been studied extensively in order to evaluate their
training program. A proposal to link the pupils in the schools to the
teachers who have received the experimental training requires the
interlocking of two data pools for research on some new and previously unplanned
comparisons. The hypotheses derived from the combined data pool would not
be independent of previously tested hypotheses. Operational use of
previously analyzed variables under new names does not increase the information
extracted from the data.

The principle that seems apparent is that as long as the new
hypotheses and the new comparisons are related operationally to the data
that had been derived from the data pool, then the new comparisons are
not unbiased. If the number of statistical tests approaches one-half
the number of variables and involves more than fifty per cent of the
subjects on each variable, then the only appropriate procedure is to
consider the data pool as giving rise to descriptive information which can
be used for devising better experiments. The data from the reanalysis of
the pool should be especially good for the development of typologies and
the evaluation of typological schemas.

The analysis of the file of teachers and pupils should give rise
to types of teachers and types of pupils. Efficient new experiments
could then be planned to take advantage of the typological scheme in
order to maximize the probability of finding significant differences in
the new experiments. The replacement of individual variables by
typological groupings can enhance the information gained from the experiments
as has been shown in the studies with the mentally retarded (Dingman, 1964).
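
The replacement of individual variables by a typological grouping can be sketched in a few lines. The example below is illustrative only: the records are invented, the two measures are hypothetical, and the median split is an assumed rule chosen for simplicity, not the procedure used in the studies cited.

```python
from collections import defaultdict
from statistics import median

# Invented records: (subject id, score on measure A, score on measure B).
subjects = [
    ("s1", 12, 3), ("s2", 15, 9), ("s3", 7, 8),
    ("s4", 5, 2), ("s5", 14, 7), ("s6", 6, 1),
]


def assign_types(records):
    """Label each subject high or low on each measure relative to the
    sample median, and collect subject ids under the resulting type."""
    cut_a = median(r[1] for r in records)
    cut_b = median(r[2] for r in records)
    types = defaultdict(list)
    for sid, a, b in records:
        label = (
            "high_A" if a > cut_a else "low_A",
            "high_B" if b > cut_b else "low_B",
        )
        types[label].append(sid)
    return dict(types)


typology = assign_types(subjects)
for label, members in sorted(typology.items()):
    print(label, members)
```

A later experiment could then sample or compare subjects by type, so that one grouping variable stands in for the several original measures.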



In a series of publications (Dingman, Dingman and Miller, Miller
et al. 1961, Miller 1962), a simple typology has been developed and
studied. In subsequent papers this typology has been shown to be a
formal restatement in mathematical terms of a categorization that has
been well recognized (Dingman and Tarjan, Tarjan et al. 1961). This
simple typology, while fairly obvious, can have important new uses
since the reorganization of the data and the subjects into objective
formal categories (Eyman, in press) can be used in powerful ways. Some
of these include predicting the probability of death and disease for
specific patients in an institution (Dingman, et al., 1964). The reuse
of the data in a new form affected the careers of mentally retarded
patients.